CSE 255 - Assignment

Author

  • Ali Ghorbani
Abstract

In this assignment we studied Google Local’s Maps and Restaurants data. The goal was to extract restaurant information from the dataset and to study how restaurants perform with respect to the features the dataset offers. We examined three main features and how each affects whether a restaurant stays in business: the geographical grouping of restaurants, users’ reviews, and users’ ratings. We used these three features to predict whether a restaurant would stay open or close. The dataset contains 3,747,937 users, 3,114,353 places, and 11,453,845 reviews. Of the businesses in the dataset, 3,014,137 were marked as open and 100,215 as closed, so about 3.22% of the businesses were marked as closed. The place with the most reviews was “Eiffel Tower”, with 1,662 reviews. The review counts formed a sharp long tail: only 126 places (0.0040% of all places) had 300 or more reviews.
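
The shares quoted in the abstract can be recomputed from the raw counts. The following is a minimal sketch, not the author's code: the variable and function names are assumptions chosen here for illustration, and only the counts themselves come from the abstract.

```python
# Counts as quoted in the abstract; all names below are illustrative.
places_total = 3114353
places_closed = 100215
places_300_plus_reviews = 126


def share_percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of places marked as closed.
closed_share = share_percent(places_closed, places_total)

# Share of places at the head of the long tail (300 or more reviews).
head_share = share_percent(places_300_plus_reviews, places_total)

print(f"closed: {closed_share:.2f}%")
print(f"places with 300+ reviews: {head_share:.4f}%")
```

Working from the counts rather than the rounded percentages makes it easy to re-derive either figure if the dataset is filtered differently.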

Similar resources

CSE 255: Assignment 1 - Exploring Musical Tagging

We explore two predictive tasks: (i) a measure of tag probability, and (ii) identifying a minimum tag set for more meaningful music classification on a 100,000 song dataset joined across complementary databases from the 1 Million Song Dataset (“MSD”). We conclude that a tag set size of around 50 tags is most meaningful and report many of our findings/analysis based on the top 50 tags. Using lin...

CSE 255 Assignment 2 Cuisine Prediction/Classification based on ingredients

In this paper, we consider different strategies for identifying the cuisine, given its ingredients. This project aims to explore what combination of ingredients is helpful in identifying a cuisine if the recipe is not given. This has been tackled as a problem of cuisine classification. We also explore different classification algorithms in tandem with approaches like taking combination of multi...

CSE 255 Assignment 1: Helpfulness in Amazon Reviews

In this paper we consider models for predicting the helpfulness rating of Amazon book reviews. We examine features such as the review’s star rating, the length of the review text, the readability of the review text, and the number of comparisons made in the review. We compare Support Vector Machine and Random Forest models for both regression and classification.

CSE 255 Assignment 2 : Upvotes Prediction for Reddit Submissions

In this paper we consider models for predicting the number of upvotes on a Reddit submission. We examine features such as the number of votes, number of comments, time of submission, upvote history of users, images, and subreddits of the submission. We compare Support Vector Regression, Linear Regression, and Gradient Boosting Regression models for predicting the number of upvotes.

Publication date: 2015